A Uyghur Morpheme Analysis Method based on Conditional Random Fields

Authors

Batuer Aisha

Maosong Sun

Abstract

Morpheme analysis is very important for Uyghur language processing. Morpheme analysis of Uyghur differs considerably from that of other languages; the key issues for this task are feature selection and the design of a morpheme-annotated corpus. In this paper, we propose a new statistical Uyghur morpheme analysis method that uses the Conditional Random Fields (CRFs) model. Preliminary experimental results demonstrate that the proposed method is effective; the F-measure of morpheme analysis reaches 87% in the open test.
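As an illustration of the reported metric, the sketch below shows one common way to compute precision, recall, and F-measure for morpheme segmentation. It assumes a predicted morpheme counts as correct only when its character span exactly matches a gold span; the abstract does not state which matching criterion the authors used. The function names `spans` and `f_measure` and the romanized toy strings are hypothetical and are not taken from the paper or its corpus.

```python
from typing import List, Set, Tuple


def spans(morphemes: List[str]) -> Set[Tuple[int, int]]:
    """Convert one word's morpheme sequence into (start, end) character spans."""
    out = set()
    start = 0
    for m in morphemes:
        end = start + len(m)
        out.add((start, end))
        start = end
    return out


def f_measure(gold: List[List[str]],
              pred: List[List[str]]) -> Tuple[float, float, float]:
    """Return (precision, recall, F-measure) under exact span matching.

    Assumption: gold[i] and pred[i] segment the same surface word.
    """
    correct = n_gold = n_pred = 0
    for g, p in zip(gold, pred):
        gs, ps = spans(g), spans(p)
        correct += len(gs & ps)   # morphemes with identical boundaries
        n_gold += len(gs)
        n_pred += len(ps)
    precision = correct / n_pred if n_pred else 0.0
    recall = correct / n_gold if n_gold else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Toy data (illustrative placeholder strings only).
gold = [["kitab", "lar", "im"], ["oqu", "di"]]
pred = [["kitab", "larim"], ["oqu", "di"]]
precision, recall, f1 = f_measure(gold, pred)
print(f"P={precision:.3f} R={recall:.3f} F={f1:.3f}")
```

In the toy data, three of the four predicted morphemes match the five gold morphemes, so precision is 0.75 and recall is 0.6. A boundary-based criterion would give different numbers, so the choice of matching rule should be stated alongside any reported score.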

Similar resources

Bidirectional Long Short-Term Memory Network with a Conditional Random Field Layer for Uyghur Part-Of-Speech Tagging

Uyghur is an agglutinative and a morphologically rich language; natural language processing tasks in Uyghur can be a challenge. Word morphology is important in Uyghur part-of-speech (POS) tagging. However, POS tagging performance suffers from error propagation of morphological analyzers. To address this problem, we propose a few models for POS tagging: conditional random fields (CRF), long shor...

Uyghur Short Text Classification Using Morphological Information

In this paper, we propose a novel method for improving the classification performance of short text strings using conditional random fields (CRFs) that combine morphological information. Experimental results on three datasets (Uyghur, Chinese, and English) demonstrate that our method can yield higher classification accuracy than Support Vector Machine (SVM) classifier and Maximum Entropy Model ...

Log-linear Models for Uyghur Segmentation in Spoken Language Translation

To alleviate data sparsity in spoken Uyghur machine translation, we proposed a log-linear based morphological segmentation approach. Instead of learning model only from monolingual annotated corpus, this approach optimizes Uyghur segmentation for spoken translation based on both bilingual and monolingual corpus. Our approach relies on several features such as traditional conditional random fiel...

Cross-lingual Word Segmentation and Morpheme Segmentation as Sequence Labelling

This paper presents our segmentation system developed for the MLP 2017 shared tasks on cross-lingual word segmentation and morpheme segmentation. We model both word and morpheme segmentation as character-level sequence labelling tasks. The prevalent bidirectional recurrent neural network with conditional random fields as the output interface is adapted as the baseline system, which is further i...

Conditional Random Fields for Airborne Lidar Point Cloud Classification in Urban Area

Over the past decades, urban growth has been known as a worldwide phenomenon that includes widening process and expanding pattern. While the cities are changing rapidly, their quantitative analysis as well as decision making in urban planning can benefit from two-dimensional (2D) and three-dimensional (3D) digital models. The recent developments in imaging and non-imaging sensor technologies, s...

Journal:

Int. J. of Asian Lang. Proc.

Volume 19

Publication date: 2009
